A video system is disclosed in which a single generic MPEG 
standard encoder (107) is used to simultaneously code and 
compress plural different resolution video signals from a 
single input video signal; and in which a single generic 
MPEG standard decoder (402) is used to simultaneously 
decode plural coded and compressed video signals of dif- 
ferent resolutions and form a single composite video signal. 
The coder converts each frame of pixel data of the input 
video signal into plural frames having different resolutions, 
which are then combined into a common frame (106) for 
input to the generic MPEG encoder. The MPEG encoder 
produces a single coded and compressed output bitstream in 
slices of macroblocks of pixel data, which output bitstream 
is demultiplexed (108) into separate resolution bitstreams 
using Slice Start Code identifiers associated with each slice 
and Macroblock Address Increments associated with the first 
macroblock in each slice, to properly route each slice to the 
appropriate output. The decoder processes (405) the slices 
within the coded and compressed bitstreams of different 
resolutions received from plural sources using the Slice Start 
Codes and Macroblock Address Increments of each slice to 
produce a single composite bitstream of successive slices. 
By merging the slices from the plural sources into the 
composite bitstream in a predetermined manner, the generic 
MPEG decoder produces a digital output video signal that is 
a composite of the different resolution input video signals. 

22 Claims, 5 Drawing Sheets 

MULTIPLE RESOLUTION, MULTI-STREAM VIDEO SYSTEM USING A SINGLE
STANDARD DECODER

CROSS REFERENCE TO RELATED APPLICATIONS

This application describes and claims subject matter that
is also described in our co-pending United States patent
application also assigned to the present assignee hereof and
filed simultaneously herewith: "MULTIPLE RESOLUTION,
MULTI-STREAM VIDEO SYSTEM USING A STANDARD CODER",
Ser. No. 08/499,700.

TECHNICAL FIELD

This invention relates to the decoding of video signals,
and more particularly to the combination of multiple coded
video signals having different resolutions into a single video
signal using a single standard video decoder.

BACKGROUND OF THE INVENTION

The acceptance of digital video compression standards,
for example, the Motion Picture Expert Group (MPEG)
standard, combined with the availability of a high-bandwidth
communication infrastructure, has poised the
telecommunications market for an explosion of video based
services. Services such as video-on-demand, multi-party
interactive video games, and video teleconferencing are
actively being developed. These and other future video
services will require a cost-effective video composition and
display technique.

An efficient multiple window display is desirable for
displaying the multiple video sequences produced by these
applications to a video user or consumer. The implementation
of such a windows environment would permit a user to
simultaneously view several video sequences or images
from several sources. The realization of a commercial
multiple window video display is hampered by technological
limitations on available data compression equipment.

In digital television and other digital image transmission
and storage applications, image signals must be compressed
or coded to reduce the amount of bandwidth required for
transmission or storage. Typically, a full screen frame of
video may be composed of an array of at least 640x480
picture elements, or pixels, each pixel having data for
luminance and chrominance. A video sequence is composed
of a series of such discrete video frames, similar to the
frames in a moving picture film. True entertainment quality
video requires a frame rate of at least thirty frames per
second. Uncompressed, the bit rate required to transmit
thirty frames per second would require far more bandwidth
than is presently practical.

Image coding techniques serve to compress the video data
in order to reduce the number of bits transmitted per frame.
There are several standard image coding techniques, each of
which takes advantage of pixel image data repetition, also
called spatial correlation.

Spatial correlation occurs when several adjacent pixels
have the same or similar luminance (brightness) and
chrominance (color) values. Consider, for example, a frame of
video containing the image of a blue sky. The many pixels
comprising the blue sky image will likely have identical or
near identical image data. Data compression techniques can
exploit such repetition by, for example, transmitting, or
storing, the luminance and chrominance data for one
pixel and transmitting, or storing, information on the number
of following pixels for which the data is identical, or
transmitting, or storing, only the difference between adjacent
pixels. Presently, spatial correlation is exploited by
compression techniques using discrete cosine transform and
quantization techniques. Where such data compression or
coding is employed, each video source must be equipped
with data compression equipment and each video receiver
must likewise be equipped with decoding equipment.
Several video coding protocols are well-known in the art,
including JPEG, MPEG1, MPEG2 and Px64 standards.

In a multipoint video application, such as a video
teleconference, a plurality of video sequences from a
plurality of sources are displayed simultaneously on a video
screen at a receiving terminal. In order to display multiple
windows, the prior art generally required multiple decoding
devices to decode the multiple video signals from the
multiple sources. At present, multiple decoder devices are
expensive, and therefore an impractical solution for creating
multiple video windows.

A further difficulty encountered in multiple window video
is that many sources provide video in only one screen
display size. In fact, many sources transmit only full screen
images, which typically comprise 640x480 pixels per frame.
To provide flexible windowing capabilities, different
users should have the option of invoking and viewing
differently sized windows of the same video. Windows
which comprise a fraction of the entire display require the
data to be filtered and subsampled, resulting in frame
signals comprising fewer pixels. It is therefore advantageous
to make video data available at a plurality of window sizes
or resolution levels. For example, the video of a participant
in a teleconference may be made available at full screen
resolution, 1/4 screen, 1/16 screen or 1/64 screen, so that the
other participants can choose a desired size window in
which to view the transmitting participant. Other examples
in which it would be advantageous to generate multiple
resolution video signals would be picture-in-picture for
digital TV in which a user would receive signals from plural
sources at only the resolutions necessary to fill a selected
image size. Similarly, a video server might output multiple
resolution streams to enable a user to display images from
multiple sources in different windows. Each window
requires less than full resolution quality. Thus, by
transmitting to the user only that bitstream associated with the
size of the image requested to be displayed rather than a full
resolution bitstream, substantial bandwidth can be saved as
can the processing power to decode the full-resolution
bitstream and to scale the resulting video to the desired less
than full resolution image size.

Under one technique of providing multiple resolution
levels, each video transmitter provides a plurality of video
sequences, each independently containing the data signal for
a particular resolution level of the same video image. One
method of generating multiple resolution video sequences
would be to employ several encoders, one for each
resolution level. The requirement of multiple encoders, however,
as in the case of decoders, increases system cost since
encoders comprise costly components in digital video
transmission systems.

The inventors of the present invention are co-inventors,
together with G. L. Cash and D. B. Swicker, of co-pending
patent application, Ser. No. 08/201,871, filed Feb. 25, 1994,
now U.S. Pat. No. 5,481,297. In that application, a
multipoint digital video communication system is described
which employs a single standard encoder, such as JPEG or
MPEG, to encode multiple resolution video signals derived
from a full resolution video signal, and a single standard
decoder, such as JPEG or MPEG, to decode and display
multiple resolution video signals. In that system,
macroblocks of a sampled full resolution video signal and
macroblocks of a subsampled input video signal at multiple
different fractional resolutions are multiplexed into a single
stream before being fed to the single standard video encoder,
which encodes or compresses each macroblock individually.
Because MPEG-based standard compression systems
employ interframe coding in which the encoder relies on
information from previous (and in some cases future)
frames, a reference frame store must provide separate
reference frame information to the encoder for each resolution.
Thus, control logic is necessary to change the reference
frame buffer as well as the resolution related information in
accordance with each macroblock's resolution as it is
processed by the encoder. Similarly, at the decoder, before
decoding macroblocks from different resolution sources the
decoder needs to be context switched and information from
a previous (and in some cases a future) frame must be
provided in the resolution associated with the macroblock.
The standard encoder and decoder must, therefore, operate
cooperatively with complex circuitry to provide the
necessary context switching functionality. Furthermore, since
context switching need be performed on a
macroblock-by-macroblock basis, substantial computational
overhead is required to enable individual macroblocks to be
processed separately.

An object of the present invention is to combine and 
decode multiple resolution coded and compressed video 
input data streams into a single video output signal using a 
single standard decoder without the complexity of context 
switching. 

SUMMARY OF THE INVENTION 

In accordance with the present invention, plural input
bitstreams representing coded and compressed pixel data
from frames of associated input image signals, such as video
signals, are simultaneously decompressed and decoded to
form a single output bitstream representing decoded and
decompressed pixel data in a frame of an output image
signal that is a composite of the frames of the input image
signals. When each coded and compressed input bitstream is
coded in successive segments in which each successive
segment is identifiably associated with a predetermined part
of the associated input image without fully decoding the
input bitstream, segments from each of the plural input
bitstreams can be multiplexed in the coded and compressed
domain so as to create a combined bitstream of successive
segments of coded and compressed pixel data that represents
a composite of the plural input image signals. A single
decoder which decompresses and decodes successive
segments of an image signal can thus be used to form the output
bitstream of decoded and decompressed pixel data of the
composite image.

More particularly, in accordance with the video decoding
system of the present invention, a single generic MPEG
decoder is used to decode and decompress plural input
bitstreams representing coded and compressed pixel data
from frames of plural input video signals of plural input
images. A single output bitstream is formed representing
pixel data in frames of an output video signal that is a
composite of the frames of the plural input video signals.
Each coded and compressed input bitstream has been coded
and compressed in a frame divisible format of successive
slices which represent the coded and compressed pixel data
in one or more macroblocks of pixels in a frame of the
associated input signal, wherein each slice has an identifiable
Slice Start Code (SSC) which identifies a row of the
slice in the frame of the associated input video signal, and
the first macroblock in each slice has a Macroblock Address
Increment (MAI) which identifies the position of that first
macroblock relative to a fixed position in the frame and
which can be retrieved from the coded and compressed pixel
data in the slice.

The decoding system of the invention combines, in the 
coded and compressed domain, frames from the plural input 
signals to form a single combined bitstream that can be 
decoded by a standard MPEG decoder to produce a com- 
posite frame comprising frames of the plural inputs. The 
combined bitstream is formed by storing the coded and 
compressed pixel data in each input bitstream representing 
one frame and multiplexing slices of the stored data from 
each input bitstream in a predetermined manner so that the 
resultant combined bitstream of successive slices represents 
the coded and compressed pixel data in a composite frame. 
In combining the slices from the plural input bitstreams, the 
SSC of each slice is renumbered as necessary in the com- 
bined bitstream according to the row of the pixel data 
associated with the slice in the composite frame and the MAI 
of each slice is renumbered as necessary according to the 
relative position of the first macroblock of the slice in the 
composite frame. The resultant combined bitstream is
inputted to the standard MPEG decoder, which is blind to the
composite nature of its input. The MPEG decoder
decompresses and decodes the combined bitstream to form a
composite frame of pixel data which can be decoded and 
displayed as a composite image. 
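
The combining step summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `compose`, the tuple layout of a slice, and the convention that an MAI is a 0-based macroblock address counted from the upper left-hand corner of a frame are all hypothetical choices made for the example, and slice payloads are treated as opaque values that are never decoded.

```python
def compose(sources, frame_mb_width):
    """Merge slices from several input bitstreams into one composite ordering.

    sources: list of (row_offset, col_offset, width_mb, slices), where
      row_offset / col_offset place the input picture in the composite frame
      (in macroblock units), width_mb is the input picture width in
      macroblocks, and slices is a list of (ssc, mai, payload) tuples.
    Assumption: mai is a 0-based address relative to the picture's top-left
    macroblock; ssc is the 1-based slice row.
    """
    combined = []
    for row_offset, col_offset, width_mb, slices in sources:
        for ssc, mai, payload in slices:
            local_row = ssc - 1
            local_col = mai - local_row * width_mb
            # Renumber the SSC to the slice's row in the composite frame.
            new_ssc = ssc + row_offset
            # Renumber the MAI to the slice's position in the composite frame.
            new_mai = (new_ssc - 1) * frame_mb_width + col_offset + local_col
            combined.append((new_ssc, new_mai, payload))
    # Slices must reach the decoder in raster order of the composite frame.
    combined.sort(key=lambda s: (s[0], s[1]))
    return combined


# Two 2x2-macroblock pictures placed side by side in a 4-macroblock-wide frame.
left = (0, 0, 2, [(1, 0, "A1"), (2, 2, "A2")])
right = (0, 2, 2, [(1, 0, "B1"), (2, 2, "B2")])
merged = compose([left, right], frame_mb_width=4)
```

In this toy layout the merged order alternates between the two inputs row by row, which is the interleaving a single decoder would need in order to see one well-formed frame.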

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of an encoding system that 
employs a single standard MPEG coder for simultaneously 
generating multiple resolution compressed video data 
streams from a single video input; 

FIG. 2 shows an arrangement for inputting pixel data from 
a plurality of different resolution inputs to the standard 
MPEG coder in the encoding system of FIG. 1; 

FIG. 3 is a flow chart showing the processing steps used 
by a programmable signal processor, at the output of MPEG 
coder in the encoding system of FIG. 1, for generating the 
multiple resolution compressed video data streams; 

FIG. 4 is a block diagram of a decoding system in 
accordance with the present invention that employs a single 
standard MPEG decoder for compositing multiple resolution 
compressed video data streams into a single image; and 

FIG. 5 is a flow chart showing the processing steps used 
by a programmable signal processor at the input of the 
MPEG decoder in the decoding system of FIG. 4. 

DETAILED DESCRIPTION 

By exploiting the interframe redundancy in video signals
through motion estimation, coders built in accordance with
MPEG and MPEG-2 coding standards (see, e.g., D. Le Gall,
"MPEG: A Compression Standard for Multimedia
Applications," Communications of the ACM, Volume 34,
Number 4, April 1991, pp. 46-58; and "Generic Coding of
Moving Pictures and Associated Audio Information: Video,"
Recommendation ITU-T H.262, ISO/IEC 13818-2, Draft
International Standard, November 1994) achieve a high
degree of compression. Specifically, MPEG encoders under
both standards are based on discrete cosine transform (DCT)
processing that operates on macroblocks of pixels of size,
for example, of 16x16 for the luminance component of the
video signal. For each macroblock in a current video frame
to be coded, a "closest" macroblock in a previously coded
frame is located and a motion vector of the spatial movement
between the macroblock in the current frame and the closest
macroblock in the previous frame is determined.
Pixel-by-pixel differences between the current macroblock and the
closest macroblock are transformed by DCT processing in
each block of 8x8 pixels within the macroblock and the
resultant DCT coefficients are quantized and variable-length
entropy coded and transmitted together with the motion
vectors for the macroblock. Considerable data compression
can be achieved using the MPEG coding standards.
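
The search for a "closest" macroblock can be illustrated with a small sketch. The matching criterion (sum of absolute differences) and the exhaustive search over a square window are assumptions chosen for illustration; the MPEG standards leave the motion estimation method to the encoder designer. The names `sad` and `best_motion_vector` are hypothetical, and frames are plain lists of rows of luminance values.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between an n x n block of cur at (cx, cy)
    and an n x n block of ref at (rx, ry)."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def best_motion_vector(cur, ref, cx, cy, n=16, search=4):
    """Exhaustively search ref within +/- search pixels of (cx, cy) for the
    block closest to the current block; return ((dx, dy), cost)."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, cx, cy, cx, cy, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Toy 12x12 frames: the current frame is the reference shifted by (2, 1).
ref = [[x + 100 * y for x in range(12)] for y in range(12)]
cur = [[ref[(y + 1) % 12][(x + 2) % 12] for x in range(12)] for y in range(12)]
vector, cost = best_motion_vector(cur, ref, 4, 4, n=4, search=3)
```

Because the search is bounded by `search`, a block can only match content within that distance, which is the property the guard bands described later rely on.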

In coding and compressing a video sequence using the
MPEG standards, the beginning of each coded and
compressed video sequence, the beginning of a group of a
predetermined number of coded and compressed video
frames, and the beginning of each coded and compressed
video frame are coded and delineated in the resultant
bitstream with identifiable headers. Further, groups of one or
more macroblocks within the same horizontal row, called
slices, are processed together to produce a variable-length
coded data string associated with the macroblock or
macroblocks of pixels in the slice. Each horizontal row of
macroblocks across the coded and compressed video frame
thus consists of one or more slices. Each slice can be located
in the bitstream of data for the frame through identifiable
byte aligned Slice Start Codes (SSCs) which both identify
the start of each slice and the vertical position of the slice
with respect to the top of the video frame. The slices are thus
numbered from 1 et seq., with all slices derived from the
same horizontal row of macroblocks having the same slice
number. A slice is then the smallest unit of data that can be
identified in the coded and compressed bitstream associated
with a frame of data without decoding the bitstream. The
number of macroblocks in each slice on any given horizontal
row is a programmable parameter that can vary within each row
of macroblocks and from row to row. Also associated with
each macroblock, in accordance with the MPEG standard, is
a Macroblock Address Increment which represents the
macroblock's position in the video frame relative to the
beginning of the slice. The first Macroblock Address Increment in
each slice, however, represents the address of the first
macroblock in the slice relative to the first macroblock in the
upper left-hand corner of the video frame. Since it is
associated with the first macroblock in each slice, the
Macroblock Address Increment for the first macroblock in
each slice is readily locatable in the variable-length coded
bitstream of data for the slice and thus can be retrieved and
decoded without decoding the bitstream. The positions of
the other Macroblock Address Increments in the bitstream of
data for each slice vary in accordance with the
variable-length coded data and therefore cannot be retrieved without
decoding the bitstream. These other Macroblock Address
Increments are referenced to the first macroblock in the slice
of which they are part.
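
Because Slice Start Codes are byte aligned, they can be found by a plain byte scan. The sketch below assumes the MPEG-1/MPEG-2 video start code layout, in which a start code is the three-byte prefix 0x000001 followed by one code byte and the code values 0x01 through 0xAF denote slices, the value being the slice's vertical position. The function name `find_slice_start_codes` is hypothetical.

```python
START_CODE_PREFIX = b"\x00\x00\x01"


def find_slice_start_codes(bitstream: bytes):
    """Return a list of (byte_offset, slice_row) for each slice start code.

    Only the start codes are inspected; the variable-length coded slice
    data between them is never decoded.
    """
    found = []
    pos = bitstream.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(bitstream):
        code = bitstream[pos + 3]
        if 0x01 <= code <= 0xAF:  # slice start code range
            found.append((pos, code))
        pos = bitstream.find(START_CODE_PREFIX, pos + 3)
    return found


# A picture start code (0x00) followed by two slices in rows 1 and 2.
sample = (
    b"\x00\x00\x01\x00\xaa"
    b"\x00\x00\x01\x01\x12\x34"
    b"\x00\x00\x01\x02\x56"
)
slices = find_slice_start_codes(sample)
```

The bytes between one returned offset and the next delimit a complete slice, which is what allows slices to be routed or reordered as opaque units.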

In order to produce multiple resolution coded video 
bitstreams from one input video signal with a single MPEG 
standard encoder, and to produce from multiple resolution 
coded video inputs a collage or windowed video display 
with a single MPEG standard decoder, several factors must 
be considered. Because of interframe coding, MPEG uses 
three different frame types: intra (I) frames, predictive (P) 
frames, and bidirectional (B) frames. I frames are coded 
solely based on the spatial correlation that exists within a 
single frame and do not depend on other frames. As such, I 
frames can be decoded independently. I frames are generally 
transmitted upon a scene change when there is little or no 
correlation between successive frames, and periodically 
every fixed number of frames. P frames are predicted based 
on a previous frame and B frames are predicted based on 
past and future frames. In implementing a multiple video 
system with MPEG, the motion estimation and different 
frame types cause the following problems. Firstly, if a 
collage of pictures is provided as input to an MPEG encoder, 
the motion estimation algorithm may incorrectly use parts 
from one picture in estimating the motion of blocks from 
another picture, which therefore prevents their independent 
use at a decoder; secondly, a decoder cannot decode
segments from different frame types mixed in the same frame;
and thirdly, context switching, i.e., changing the state
information for encoding or decoding from different sources using
a single encoder or decoder, is complicated. The encoding
and decoding systems described herein below overcome
these difficulties by utilizing a single generic MPEG encoder
to generate multiple independent resolution MPEG syntax
video data streams simultaneously from a single source of
video and, correspondingly, by utilizing a single generic
MPEG decoder for simultaneously decoding such streams
received from multiple sources.

The encoder of the present invention can be used with any
video source type, examples of which are: NTSC, PAL,
SECAM or a progressively scanned source. For purposes of
illustration only, it will be assumed that the input signal is an
NTSC signal, which comprises two interlaced fields per
video frame, at a frame rate of 30 frames per second, each
frame comprising 525 scan lines. With reference to the
encoding system of the present invention in FIG. 1, each
frame of the video input signal on 101 is digitized by a
conventional analog-to-digital converter 102, well known in
the art, which digitizes the analog input signal into a
digital signal of 640x480 pixels. The frame is separated into
its two separate component fields 103 and 104, each having
a resolution of 640x240 pixels. For an NTSC input, this is
a trivial operation since the video signal is already divided
into two fields per frame. Filtering and subsampling
circuitry 105 horizontally filters and subsamples field 1 to
produce a 320x240 pixel picture and then horizontally and
vertically filters and subsamples that picture to produce a
160x112 pixel picture. Field 2 is similarly horizontally
filtered and subsampled to produce a 320x240 pixel picture,
and then horizontally and vertically filtered and subsampled
again to produce an 80x48 pixel picture. Filtering and
subsampling circuit 105 is a conventional circuit, well
known in the art, which includes digital anti-aliasing filters
for low-pass filtering the input signal for purposes of
removing high frequency components that could otherwise corrupt
a subsampling operation. As previously described, the
MPEG encoder processes the pixel data in block format,
which, by way of example, comprises 16x16 pixels per
macroblock for the luminance portion of the video signal. In
order to define each resolution picture along macroblock
boundaries, the resolution of each picture is chosen to be in
integral multiples of 16 in both the horizontal and vertical
directions.
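
The macroblock alignment constraint can be checked with simple arithmetic. The sketch below is only an illustration; the helper name `macroblock_grid` is hypothetical. It shows, for instance, why a 160x112 picture fits the constraint while 160x120 (a plain one-quarter scaling of 640x480) would not.

```python
MACROBLOCK = 16  # luminance macroblock size in pixels, per the text above


def macroblock_grid(width, height, mb=MACROBLOCK):
    """Return (columns, rows) of macroblocks for a picture, or raise
    ValueError if either dimension is not an integral multiple of mb."""
    if width % mb or height % mb:
        raise ValueError(f"{width}x{height} is not aligned to {mb}-pixel macroblocks")
    return width // mb, height // mb


grids = {
    "320x240": macroblock_grid(320, 240),
    "160x112": macroblock_grid(160, 112),
    "80x48": macroblock_grid(80, 48),
}
```

The three pictures therefore occupy 20x15, 10x7 and 5x3 macroblocks respectively.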

The four subsampled component fields are stored in a 
logical flame buffer 106, which comprises 640x512 pixel 
60 storage locations. The subsampled fields are separated from 
each other within buffer 106, however, by "guard bands", 
shown in FIG. 1 by the cross-hatched area. As the pixel data 
is fed to a generic MPEG encoder 107 from buffer 106, the 
guard bands prevent one picture in the buffer from being 
65 used in conjunction with an adjoining picture as morion 
compensation processing searches for a best matching mac- 
robiock and computes motion vectors therefrom. By filling 
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the guard bands with a pattern that is not likely to be found As previously described, the MPEG encoder 107 groups 
in a normal video sequence, such as a pixel based checker macroblocks into slices which can be identified in the 
board pattern, the motion estimation algorithm will never resultant compressed data stream with Slice Start Codes that 
pick a matching block from the guard band area, thereby both delineate the beginning of each slice and which indicate 
ensuring that motion estimation for each individual rcsolu- 5 the vertical position of the slice relative to the top of the 
tion picture is limited to its own perimeter-defined area. coded Image. As previously noted, the slice length is an 
As has been discussed, the MPEG encoder processes each adjustable programmable parameter which can vary line- 
input video frame in macroblocks and slices. As will be by-lme and within each row throughout the entire composite 
discussed, the MPEG standard encoder processes the pixel image presented to the encoder from buffer 106. By limiting 
data in the logical frame buffer 106 horizontally in slice 10 the length of a shce along aU horizontal rows that encompass 
format. Thus, the horizontal edges of each of the four more than one individual image in frame buffer 106 to be no 
resolution component pictures stored in the logical frame longer than the shortest width picture, and by placing the 
buffer 106 are along slice boundaries, and thus also, mac- vertical edge of each resolution picture on a slice boundary, 
roblock defined horizontal boundaries, and the vertical the coded data associated with each resolution picture can be 
edges of each resolution picture are along macroblock ^ demultiplexed at the slice level, without needing to decode 
defined vertical boundaries. Thus, for macroblocks defined the compressed data stream, 

as 16x16 pixels (for luminance), the horizontal edge of each FIG. 2 shows frame buffer 106 divided into 32 rows for 

component picture is along a slice row of pixels that is an processing by encoder 107. As previously noted, the Slice 

integral multiple of 16, and the vertical edge of each Start Code (SSC) of each slice along each row is the same, 

component picture is along a macroblock edge of pixels that 20 The Slice Start Codes for the slices in the 320x240 (field 1) 

is also an integral multiple of 16. Further, to vertically resolution picture are numbered 1-15 and the Slice Start 

separate field 1 and field 2 to prevent motion estimation Codes for the slices in the 320x240 (field 2) resolution 

within one of the pictures from using the other, the guard picture are numbered 1S-32. The 160x112 (field 1) resolu- 

band consists of two slice rows, or 32 pixels high. tion picture consists of seven slices having SSC's 4-10, and 

The contents of me logical frame buffer 106 can be 25 toe 80x48 (field 2) resolution picture consists of three slices 

outputted in raster-order to the generic MPEG encoder 107. having SSC s 23-25. By way of example, the slice length of 

Although shown in FIG. 1 as an actual physical frame buffer each slice in each row is chosen to be 80 pixels, or five 

of pixel size 640x5 12 (32 slices high), logical frame buffer macroblocks. By so selecting the slice length, the 160x112 

106 represents in actuality a time arrangement for presenting (field 1) picture begins in the horizontal direction at pixel 
data from the four resolution pictures to the MPEG encoder 30 number 400 so as to be placed at a slice boundary and the 

107 so that the encoder 107 can produce four independent 80x48 (field 2) picture begins in the horizontal direction at 
output coded video bitstreams. Thus frame buffer can be a pixel number 480. The compressed bitstream produced by 
640x5 12 pixel frame storage device, or it can be a storage encoder 107 having slices which are identifiably attributable 
device of smaller size that has sufficient storage space to to each of the component resolution pictures can be demul- 
stcre the processed filtered and subsampled data as outputted 35 tiplexed into four separate resolution bitstreams. 

by filtering and subsampling circuit 105, which are then The compressed composite output bitstream of encoder 

provided, as needed, as an inputs to the generic MPEG 107 is inputted to a programmable digital signal processor 

encoder 107. In order to be used with a multistream decoder, 108. Since the Slice Start Codes are uniquely decodable and 

described hereinafter, MPEG encoder 107 must code each byte aligned in the encoded bitstream, their identification is 

frame as a predictive frame (F-frame) using the same 40 straightforward. The slices belonging to each different reso- 

quantization matrix for each sequence. Individual slices or lution picture are thus stripped into four independent 

macroblocks within these, however, can be coded as intra (I) streams, with the slices associated with the "data" in the 

whenever needed. Quantization factors can also be adjusted slices within the guard band in frame buffer 107 being 

at the macroblock level. This way a decoder can mix slices deleted. In forming each independent bitstream for each 

from different streams under a single frame. By restricting 45 resolution picture for output on output leads 109-112, 

the motion compensation search range to the size of the however, the Slice Start Codes for certain resolution pictures 

guard bands around a picture, the MPEG coder 107 produces need to be renumbered. Thus, for the 160x112 (field 1) 

a single bitstream which contains the compressed video for picture, the Slice Start Codes shown in FIG. 2 as being 

the four resolutions, 320x240 (field 1), 160x112 (field 1), numbered 4-10, are renumbered 1-7, respectively. 

320x240 (field 2), and 80:48 (field 2). The slice size, motion 50 Similarly, for the 320x240 (field 2) picture, the Slice Start 

estimation range and * 4 P-frame coding only" are all pro- Codes numbered 1&-32 are renumbered 1-15, respectively, 

grammable parameters for "generic" MPEG encoders.

The pixel data in frame buffer 106 is fed pixel-by-pixel to
encoder 107, which processes the data in 16x16 pixel
macroblocks (for the luminance component) and encodes
each macroblock and group of macroblocks along a com-
mon horizontal row (the slices). Encoder 107 is essentially
"blind" to the fact that the input being provided to it consists
of plural images at different resolutions rather than a single
higher resolution image. As aforenoted, the component
images are separated from one another by a guard band
containing a pattern unlikely to appear in any image.
Encoder 107, when comparing the data presented to it from
the current frame as stored in buffer 106, with the data from
a previous frame as is stored in its own internal buffer, will
therefore not likely find a matching block in its search range
from anywhere other than its own resolution picture.

The Slice Start Codes numbered 23-24 in the 80x48 (field
2) picture are renumbered 1-3, respectively. Because of its
position in the composite frame, the Slice Start Codes 1-15
for the 320x240 (field 1) picture do not need to be renum-
bered.

The horizontal position of a slice is determined from the
address of the first macroblock, which can't be skipped (i.e.,
always included as a coded block) according to MPEG
standards. The address for the first macroblock of a slice is
a function of the previous slice number and the number of
macroblocks per slice. As previously noted, the macroblock
address indicator (MAI) is referenced to the first macroblock
in the upper left-hand corner of the picture. For an input
frame described above comprising 640x512 pixel locations,
there are equivalently 40x32=1280 macroblocks. In forming
the four separate resolution bitstreams, in addition to the
Slice Start Code renumbering that processor 108 must effect
described above, the MAI associated with the first macrob-
lock in each slice in each stream is also likely to require
changing to properly reference each slice to the beginning of
its new lower resolution picture. Thus, for example, in forming
the 320x240 (field 1) bitstream, the MAI in each slice
having SSC=2 in the 320x240 (field 1) picture is decreased
by 20, the MAI in each slice having SSC=3 in this same
picture is decreased by 40, the MAI in each slice having
SSC=4 is decreased by 60, etc., so that the resultant MAIs
in this bitstream are properly referenced to the 320x240
resolution size, which contains only 300 macroblocks. The
MAIs in the other resolution bitstreams are similarly renum-
bered in accordance with their position in frame buffer 106
and their particular resolutions.
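
By way of illustration only, the macroblock counts and the MAI decreases quoted above can be reproduced with a few lines of Python. The sketch assumes that macroblock addresses are counted in raster order from the upper left-hand corner, and that the component picture starts at the first row and left edge of the composite frame, as the 320x240 (field 1) picture does. The function names are hypothetical and are not part of the described processor 108.

```python
MB = 16  # macroblock edge in pixels (luminance component)


def macroblock_grid(width, height):
    """Return (columns, rows) of 16x16 macroblocks, rounding partial blocks up."""
    return (-(-width // MB), -(-height // MB))


def mai_decrease(ssc, composite_width, component_width):
    """Amount subtracted from the first-macroblock address of the slice in
    row `ssc` (1-based) when a left-aligned component picture is cut out of
    the composite frame. Assumes raster-order addressing."""
    composite_cols = composite_width // MB
    component_cols = component_width // MB
    return (ssc - 1) * (composite_cols - component_cols)


cols, rows = macroblock_grid(640, 512)      # composite input frame
small_cols, small_rows = macroblock_grid(320, 240)
total_composite = cols * rows               # 40 x 32
total_component = small_cols * small_rows   # 20 x 15
```

For a component picture that does not start at the left edge or first row, a further constant offset would be needed; that case is not covered by this sketch.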

As previously discussed, the location of the MAI in each 
slice is readily locatable since it is very close to the Slice 
Start Code. Accordingly, it can be located and renumbered 
without decoding the slice bitstream. This MAI is variable 
length coded and therefore not byte aligned. Frequently, 
therefore, renumbering of this address necessitates bit level 
shifts for the rest of the slice to ensure that all succeeding 
slices in the demultiplexed bitstream remain byte aligned.
Thus, binary '0's are added to the end of the slice's
bitstream, where needed, to compensate for renumbering the
slice's MAI.
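
The padding step can be sketched as follows, purely as an illustration. The slice is modelled as a string of '0' and '1' characters, the bit offset of the MAI code is assumed to be already known, and the code words used in any example are placeholders rather than entries from the MPEG variable-length code table. The function name `replace_mai` is hypothetical.

```python
def replace_mai(slice_bits, mai_offset, old_code, new_code):
    """Swap the variable-length MAI code at bit position `mai_offset` and
    append '0' bits so that the slice again ends on a byte boundary.

    slice_bits: string of '0'/'1' characters for one slice.
    old_code, new_code: bit strings of possibly different lengths.
    """
    end = mai_offset + len(old_code)
    if slice_bits[mai_offset:end] != old_code:
        raise ValueError("old MAI code not found at the given offset")
    # Everything after the MAI shifts by len(new_code) - len(old_code) bits.
    shifted = slice_bits[:mai_offset] + new_code + slice_bits[end:]
    pad = (-len(shifted)) % 8  # number of fill bits needed, 0..7
    return shifted + "0" * pad


# Toy 16-bit slice: 8 header bits, a 1-bit MAI code, 7 payload bits.
toy = "00000001" + "1" + "0101010"
patched = replace_mai(toy, 8, "1", "011")
```

Because only the MAI code and the trailing fill bits change, the macroblock data itself is shifted but never decoded.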

The processing steps required of processor 108 to form 
the separate multiple resolution outputs are shown in FIG. 3. 
At step 301 the appropriate high level data is prepared for 
each of the multiple resolution bitstreams to be outputted. 
This high level data (HLD) includes the video sequence 
header, the GOP (group-of-picture) headers, and the picture 
level headers. This data is stored for both present and future 
placement in each component bitstream. At step 302 the 
necessary HLD is inserted into each output bitstream. At 
step 303 the first SSC is located in the output composite 
bitstream of encoder 107. At step 304 the current slice is 
classified. By examining both the SSC and the MAI of the
first macroblock, and from a known pattern that relates SSCs
and MAIs to the separate resolution pictures or to the guard
band, each slice is associated with either one of the output
bitstreams being formed or with the guard band. At decision
step 305, if the slice is a guard band slice all its bits are 
deleted (step 306) until the next SSC. If not a guard band 
slice, the SSC and Macroblock Address Increments are 
renumbered in accordance with the output stream to which 
the slice is directed (steps 307 and 308). The slice is then 
routed to the appropriate output stream (step 309) and byte 
aligned, where necessary to compensate for changes in the 
slice length due to the replacement MAI (step 310). All the 
slices in the output stream from encoder 107 are thus 
sequentially processed, their SSCs and MAIs renumbered as
necessary, and directed to their appropriate output bit- 
streams. When the entire frame of data has been processed, 
the necessary high level data is reinserted into each output 
bitstream (step 311) and the next sequential frame is pro- 
cessed. 
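
The slice loop of FIG. 3 (steps 303 through 310) can be outlined as below. This is a minimal sketch under stated assumptions: each slice is represented as an (SSC, MAI, payload) tuple that has already been located in the encoder output, the classification and renumbering rules are supplied by the caller because they depend on the layout of the composite frame, and high level data insertion and byte alignment are omitted. The names `demultiplex`, `classify` and `renumber` are hypothetical.

```python
from collections import defaultdict

GUARD = None  # value returned by classify() for guard band slices


def demultiplex(slices, classify, renumber):
    """Route each slice of one composite frame to its output stream.

    slices:   iterable of (ssc, mai, payload) in encoder output order.
    classify: function (ssc, mai) -> stream name, or GUARD   (step 304)
    renumber: function (stream, ssc, mai) -> (new_ssc, new_mai)
              (steps 307 and 308)
    """
    outputs = defaultdict(list)
    for ssc, mai, payload in slices:
        stream = classify(ssc, mai)
        if stream is GUARD:  # steps 305 and 306: drop guard band slices
            continue
        new_ssc, new_mai = renumber(stream, ssc, mai)
        outputs[stream].append((new_ssc, new_mai, payload))  # step 309
    return dict(outputs)


# Toy layout (not the layout of FIG. 2): rows 1-2 belong to stream "A",
# row 3 is a guard band, rows 4-5 belong to stream "B".
def toy_classify(ssc, mai):
    if ssc <= 2:
        return "A"
    if ssc == 3:
        return GUARD
    return "B"


def toy_renumber(stream, ssc, mai):
    return (ssc, mai) if stream == "A" else (ssc - 3, mai)


toy_slices = [(n, 1, b"slice%d" % n) for n in range(1, 6)]
toy_out = demultiplex(toy_slices, toy_classify, toy_renumber)
```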

As is apparent, by preprocessing and post-processing the
video bitstream inputted to the generic MPEG encoder 107,
multiple resolution output bitstreams are produced without
in any way needing to modify the encoder itself. The
compressed coded multiple resolution video bitstreams on
outputs 109-111 can be transmitted over a network for
selective reception, or stored in a video storage medium for
later individual retrieval.

FIG. 4 illustrates a decoder capable of decoding and
compositing several multiple resolution streams generated
by either the above described encoding process or any other
MPEG format coding process. This decoder incorporates a
standard generic MPEG decoder to decode and composite
these plural component multiple resolution video signals.
These coded video signals would likely originate from
different sources so that the decoded composited video
image will be a collage of multiple images having different
sizes on the receiver's video display device. The key to
transforming a generic MPEG decoder into a multiple video
decoder is the partitioning of the internal frame buffer 401
used by the generic MPEG decoder 402. As shown, it is
possible to place several lower resolution component pic-
tures within frame buffer 401, which normally holds a single
640x480 resolution frame of video. This is accomplished by
presenting the slices from each of the coded input signals in
a predetermined order to the decoder 402 that mirrors the
order of slices in the partitioned frame buffer 401.
Simultaneously, the Slice Start Codes and Macroblock
Address Increments of the presented slices are renumbered,
where necessary, in accordance with their position in the
partitioned frame buffer. Once frame buffer 401 is
partitioned, a decoded displayed image will be a single
image that comprises the plural lower resolution input
images in the partitioned format of the frame buffer 401.

In the example shown in FIG. 4, frame buffer 401 is
capable of holding three 320x240 pixel resolution images,
three 160x112 pixel resolution images and four 80x48 pixel
images. The up to ten MPEG coded input bitstreams asso-
ciated with these images and received on inputs 403-1
through 403-10 are inputted to line buffers 404-1 through
404-10, respectively.
The incoming bitstreams must first be buffered because the
slice processing requires access to the input slices in order
of their placement within the frame buffer 401, and the input
sequence of the multiple input bitstreams cannot be readily
controlled from geographically distributed sources. Also,
with most currently available MPEG decoders, a complete
compressed frame must be provided without any breaks in
the input data.

The buffered input streams are applied to slice processor
405, which processes the slices to create a multiplexed
bitstream that places each slice in a predetermined location
in the bitstream so that, when input to the internal frame
buffer 401, each slice will be located in the physical frame
storage location of the image with which it is associated.

Slice processor 405 can be a digital signal processor or a
CPU, which operates on the input streams and buffers the
modified streams. As programmable processor 108 did in the
encoder of FIG. 1, as described herein above, slice processor
405 examines the Slice Start Codes of each slice within the
component bitstream and renumbers it according to its
predetermined position in the composite image that will be
formed from the combined bitstream. Thus, for example, the
SSC of each slice from a 320x240 pixel resolution input
bitstream directed to the 320x240 pixel image area in
location 3 in frame buffer 401 is renumbered from between
1 and 15, to between 16 and 30, respectively, for proper
placement. As further examples, the SSC of each slice in a
160x112 pixel resolution input bitstream directed to the
160x112 pixel image area in location 6 in frame buffer 401
is renumbered from between 1 and 7, to between 23 and 29,
respectively; and the SSC of each slice in an input bitstream
directed to the 80x48 pixel image area in location 10 is
renumbered from between 1 and 3, to between 26 and 28,
respectively. On the other hand, the SSCs of 320x240 pixel
input bitstreams directed to the 320x240 pixel image areas
1 or 2 do not need to be renumbered since they remain
between 1 and 15 in the composite image.
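
The renumbering examples above amount to adding a fixed row offset per frame buffer location. As an illustrative sketch only, the offsets implied by the quoted ranges can be tabulated in Python; locations not mentioned in the text are deliberately left out rather than guessed, and the names `SSC_OFFSET` and `composite_ssc` are hypothetical.

```python
# Row offset per frame buffer location, derived from the ranges given above:
# location 3: 1..15 -> 16..30, location 6: 1..7 -> 23..29,
# location 10: 1..3 -> 26..28, locations 1 and 2: unchanged.
SSC_OFFSET = {1: 0, 2: 0, 3: 15, 6: 22, 10: 25}


def composite_ssc(location, ssc):
    """Return the Slice Start Code number a slice receives in the composite
    bitstream. Raises KeyError for locations the text gives no range for."""
    return ssc + SSC_OFFSET[location]
```

For instance, `composite_ssc(6, 7)` gives the last slice row of the 160x112 picture placed in location 6.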
As was the case in forming the multiple resolution video
signals from the single input in the encoder of FIG. 1
described above, the Macroblock Address Increments asso-
ciated with the first macroblock in each slice generally also
need to be renumbered in the composite bitstream. Thus, as
previously described, the MAI associated with the first
macroblock in each slice is accessed, read, renumbered, and
reinserted into the slice so as to properly reference each slice
to the first macroblock in the upper left-hand corner of the
composite image rather than to the first macroblock in the
upper left-hand corner of each individual component image.
As in the case of the encoder, when substituting one MAI
with another in the variable-length coded bitstream, fill '0'
bits may need to be inserted at the end of the slice to
maintain byte alignment.

FIG. 5 shows a flow chart of the processing steps of slice
processor 405 as it processes the buffered component lower
resolution bitstreams on inputs 403-1 through 403-10. At
step 501 the necessary high level data for the composite
bitstream is prepared and installed since the high level data
indicating resolution, etc., in each component input bitstream
is inapplicable to the 640x480 resolution of the composite
image bitstream. At step 502 a processing order for the input
bitstreams is established to effect processing of the slices in
the order prescribed by the partitioned frame memory
arrangement. At step 503 the current bitstream is set as the
first bitstream to be placed in position 1 in the composite
image (shown in frame buffer 401 in FIG. 4) and the current
SSC is also set as 1. The current SSC in the current input
bitstream is then located at step 504, which for the first
bitstream and SSC of 1, is the slice having SSC=1 in the
320x240 input 1. At steps 505 and 506 the SSC and MAI are
adjusted for that slice, which for that first slice in the first
input is not required. The bits in the current slice are then
sent to the decoder 402 at step 507 after being byte aligned
(step 508), if necessary (not necessary when the SSC and
MAI are not renumbered). At step 509, a decision is made
what next current slice is to be processed from what current
bitstream. This decision is based upon both the input bit-
stream and the particular slice just processed, and the order
of presentation necessary to effect the desired partitioning of
frame buffer 401. If the next slice to be processed is from the
same frame, determined at step 510 based on the slice and
input bitstream just processed, then steps 504 through 509
are repeated for this next slice within the frame. If the next
current slice to be processed is in the next frame, then at step
511 high level data (picture start code, etc.) is inserted into
the bitstream inputted to decoder 402 and then
steps 504 through 508 are repeated.

The resultant composite bitstream outputted by slice
processor 405 appears for all purposes to MPEG decoder
402 as a single resolution bitstream and decoder 402 utilizes
the partitioned frame buffer 401 as if it contains a single
stream. Thus, the decoder 402 is blind to the composite
nature of its input and decodes all slices and macroblocks as
if they belong to the same single image. It needs to be noted
that processing of the multiple inputs in this manner is only
possible if all the input streams are encoded using just
P-frames. This is because slices from different frame types
cannot be mixed.

The cross-hatched slices at the bottom of the lower
resolution pictures in frame buffer 401 are for size adjust-
ment and do not require any processing by the decoder. The
resultant digital raster output from decoder 402 can be
outputted directly to a monitor, which displays the up to ten
separate component images derived from the ten input
streams as a composite collage of images. In such a display,
each image is shown in its position in frame buffer 401. A
more general output can be produced by inputting the output
of decoder 402 to a pixel compositor 407, which consists of
an address translator 408, a line segment mapping table 409,
and a video frame buffer 410. Pixel compositor 407 per-
forms the functions of placement and scaling of the inde-
pendent images within the digital raster. The output of the
address translator 408 is placed into the video frame buffer
410 for display on either a computer monitor or a television
monitor. The number of pictures to be placed in the frame
buffer can be reduced by using "still" pictures for the unused
locations.

Pixel compositor 407 performs the remapping and hori-
zontal interpolation function on the raster output of decoder
402. The pixel compositor keeps track of the current line and
pixel within the raster using standard line-counter and
pixel-counter circuitry. Using this information, address
translator 408 parses the raster into fixed line segments and,
using the line segment mapping table 409, redirects a given
line segment to any other segment within the video frame
buffer 410. Also, a given line segment can be expanded
using linear interpolation within the pixel compositor. The
command to expand a given line segment is part of the line
segment mapping table.

Although it may appear that the system is limited to
displaying only 320x240 pixel resolution streams or less,
this is not the case. It is possible to combine the two
320x240 pixel coded fields from one source into a single
640x480 pixel image by horizontally interpolating and by
interlacing the scan lines of the two fields into the video
frame buffer. This uses up half the I and P frame memory
leaving room for two more 320x240 streams or other
combinations having the equivalent area of 320x240.
Clearly, if needed, the generic MPEG decoder 402 can also
decode a genuine 640x480 resolution picture.

The multiple resolution encoder and multiple resolution
decoder described herein above can interact together with
each other or separately. Thus, the separate resolution out-
puts of the encoder can be decoded either with a standard
MPEG decoder or with a multiple resolution decoder
described herein above which incorporates the standard
MPEG decoder. Similarly, the source of the various resolu-
tion inputs to the multiple resolution decoder described
herein above can be from separate MPEG encoders, and/or
from multiple resolution encoders of the type described
above. Further, when the described encoder transmits one of
its resolution outputs to the described decoder, a user in
viewing the received video signal may decide to increase or
decrease the size of the display window. By transmitting a
resolution request signal to the encoder, one of the other
resolution outputs of the multiple resolution encoder can be
transmitted to the user in addition to or in place of the
original resolution video signal. For example, a user viewing
a 320x240 (field 1) resolution video signal may decide to
view a full resolution image. The 320x240 (field 2) image
may then be additionally transmitted, which would be com-
bined with the 320x240 (field 1) image by horizontal inter-
polation and interlacing to produce a full resolution 640x480
video signal. Alternatively, any one resolution signal can be
replaced by any other available lower or higher resolution
signal.

Although described in conjunction with an MPEG stan-
dard decoder, the present invention can be used with other
types of decoders for simultaneously decoding plural input
bitstreams representing coded and compressed pixel data
from frames of associated input images. As long as each
input bitstream is coded and compressed in successive
segments in which each successive segment is identifiably
associated with a predetermined part of its input image
without fully decoding the input bitstream, the input bit-
streams can be combined in the coded and compressed
domain to form a frame of a composite image which can
then be decompressed and decoded by a decoder which
processes the segments in the combined bitstream as if they
were associated with a single image.
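
As one possible illustration of combining segments in the coded domain, the ordering performed by slice processor 405 can be sketched in Python. The sketch assumes that each buffered input holds the slices of one frame as (SSC, payload) pairs, that each input has a known placement expressed as a row offset and a column offset in macroblocks, and that the first-macroblock address is a 0-based raster-order index into the composite frame. These conventions, and the name `merge_frame`, are assumptions made for the example; byte alignment and high level data are omitted.

```python
def merge_frame(inputs, placement, composite_cols):
    """Merge the slices of several component bitstreams into the order a
    single decoder expects for the composite frame.

    inputs:         dict name -> list of (ssc, payload), ssc is 1-based
    placement:      dict name -> (row_offset, col_offset) in macroblocks
    composite_cols: macroblocks per row of the composite frame
    Returns a list of (new_ssc, first_mb_address, name, payload).
    """
    merged = []
    for name, slices in inputs.items():
        row_offset, col_offset = placement[name]
        for ssc, payload in slices:
            new_ssc = ssc + row_offset                     # renumbered SSC
            first_mb = (new_ssc - 1) * composite_cols + col_offset
            merged.append((new_ssc, first_mb, name, payload))
    # Raster order of the composite: by first macroblock address.
    merged.sort(key=lambda s: s[1])
    return merged


# Two toy inputs placed side by side in a 40-macroblock-wide composite.
toy_inputs = {
    "a": [(1, b"a1"), (2, b"a2")],
    "b": [(1, b"b1"), (2, b"b2")],
}
toy_placement = {"a": (0, 0), "b": (0, 20)}
toy_merged = merge_frame(toy_inputs, toy_placement, 40)
```

The payloads are never inspected, which reflects the point made above: only the segment identifiers are read and rewritten, and the decoder treats the merged sequence as one image.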

The above-described embodiment is illustrative of the 
principles of the present invention. Other embodiments 
could be devised by those skilled in the art without departing 
from the spirit and scope of the present invention. 

The invention claimed is: 

1. A video decoding system for simultaneously decoding 
plural input bitstreams representing coded and compressed 
pixel data from frames of associated plural input video 
signals of plural input images to form a single output 
bitstream representing decoded and decompressed pixel data 
in frames of an output video signal that is a composite of the 
frames of the plural input video signals, each of said coded 
and compressed input bitstreams being coded and com- 
pressed in a frame divisible format of successive slices 
which represent the coded and compressed pixel data in one 
or more macroblocks of pixels in a frame of the associated 
input video signal, each slice having an identifiable delin-
eated slice start code (SSC) which identifies a row of the
slice in the frame of the associated input video signal, and 
a first macroblock in each slice having an identifiable 
macroblock address increment (MAI) which identifies the 
position of that first macroblock in the slice relative to a 
fixed position in the frame and which can be retrieved from 
the coded and compressed pixel data in the slice in the input 
bitstream, the system comprising: 

a plurality of storage means, each storage means for 
storing a portion of one of said input bitstreams that 
represents the coded and compressed pixel data in a 
frame of the associated input video signal; 
processing means for multiplexing slices of the coded and 
compressed pixel data from the stored portions of the 
input bitstreams in said plurality of storage means in a 
predetermined manner to form a combined bitstream of
successive slices of coded and compressed pixel data
that represents coded and compressed pixel data in a
composite frame that is a composite of the frames of the
input video signals associated with the multiplexed
input bitstreams, the SSC of each slice being renum-
bered as necessary in the combined bitstream according
to the row of the pixel data associated with the slice in
the composite frame and the MAI of each slice being
renumbered as necessary according to the relative
position of the first macroblock of the slice in the
composite frame; and 
a single standard decoder which converts the combined 
bitstream of successive slices of coded and compressed 
pixel data associated with the composite frame to a 
decoded and decompressed composite frame of pixel
data, wherein the decoded and decompressed compos-
ite frame of pixel data can be decoded and displayed as
a single image comprising plural input images associ-
ated with the input bitstreams multiplexed by said
processing means.

2. The decoding system of claim 1 wherein said single
standard decoder is an MPEG standard decoder.

3. The decoding system of claim 1 further comprising a
pixel compositor for rearranging the plural images in the
composite frame.

4. The decoding system of claim 1 wherein the plural
input video signals have different resolutions.

5. The decoding system of claim 4 wherein the different
resolutions of the plural input video signals are 1/4, 1/16 and
1/64 of the resolution of the composite frame.

6. A method for simultaneously decoding plural input
bitstreams representing coded and compressed pixel data
from frames of associated plural input video signals of plural
input images to form a single output bitstream representing
decoded and decompressed pixel data in frames of an output
video signal that is a composite of the frames of the plural
input video signals, each of said coded and compressed input
bitstreams being coded and compressed in a frame divisible
format of successive slices which represent the coded and
compressed pixel data in one or more macroblocks of pixels
in a frame of the associated input video signal, each slice
having an identifiable delineated slice start code (SSC)
which identifies a row of the slice in the frame of the
associated input video signal, and a first macroblock in each
slice having an identifiable macroblock address increment
(MAI) which identifies the position of that first macroblock
in the slice relative to a fixed position in the frame and which
can be retrieved from the coded and compressed pixel data
in the slice in the input bitstream, the system comprising:

storing a portion of each of the input bitstreams that
represents the coded and compressed pixel data in a
frame of the associated input video signal;
multiplexing slices of the coded and compressed pixel
data from the stored portions of the input bitstreams in
a predetermined manner to form a combined bitstream
of successive slices of coded and compressed pixel data
that represents coded and compressed pixel data in a
composite frame that is a composite of the frames of the
input video signals associated with the multiplexed
input bitstreams, the SSC of each slice being renum-
bered as necessary in the combined bitstream according
to the row of the pixel data associated with the slice in
the composite frame and the MAI of each slice being
renumbered as necessary according to the relative
position of the first macroblock of the slice in the
composite frame; and
converting the combined bitstream of successive slices of
coded and compressed pixel data associated with the
composite frame to a decoded and decompressed com-
posite frame of pixel data using a single standard
decoder, wherein the decoded and decompressed com-
posite frame of pixel data can be decoded and displayed
as a single image comprising plural input images
associated with the multiplexed input bitstreams.

7. The method of claim 6 wherein the single standard
decoder is an MPEG standard decoder.

8. The method of claim 6 further comprising the step of
rearranging the plural images in the composite frame.

9. The method of claim 6 wherein the plural input video
signals have different resolutions.

10. The method of claim 9 wherein the different resolu-
tions of the plural input video signals are 1/4, 1/16 and 1/64 of
the resolution of the composite frame.

11. A decoding system for simultaneously decoding plural
input bitstreams representing coded and compressed pixel
data from frames of associated input image signals to form
a single output bitstream representing decoded and decom-
pressed pixel data in a frame of an output image signal that
is a composite of the frames of the plural input image
signals, each of said coded and compressed input bitstreams
being coded in successive segments in which each succes-
sive segment is identifiably associated with a predetermined
part of the associated input image without fully decoding the
input bitstream, the system comprising:
means for multiplexing segments from each of the plural
input bitstreams in a coded and compressed domain in
a predetermined manner to form a combined bitstream
of successive segments of coded and compressed pixel
data, the segments being ordered in the combined
bitstream in such a manner that the combined bitstream
represents a coded and compressed single frame that is
a composite of the frames of the plural input image
signals; and

means for decompressing and decoding the successive
segments of the combined bitstream to form a frame of
pixel data of an image signal that is a composite of the
frames of the plural input image signals.

12. The decoding system of claim 11 further comprising
a pixel compositor for rearranging the images within the
composite output image.

13. The decoding system of claim 11 wherein the input 
image signals are video signals. 

14. The decoding system of claim 13 wherein the means 
for decompressing and decoding is an MPEG standard
decoder. 

15. The decoding system of claim 11 wherein the input 
image signals have different resolutions. 

16. The decoding system of claim 15 wherein the different
resolutions of the input image signals are 1/4, 1/16 and 1/64 of
the resolution of the composite output image.

17. A method for simultaneously decoding plural input
bitstreams representing coded and compressed pixel data
from frames of associated input image signals to form a
single output bitstream representing decoded and decom-
pressed pixel data in a frame of an output image signal that
is a composite of the frames of the plural input image
signals, each of said coded and compressed input bitstreams
being coded in successive segments in which each succes-
sive segment is identifiably associated with a predetermined
part of the associated input image without fully decoding the
input bitstream, the method comprising the steps of:
multiplexing segments from each of the plural input
bitstreams in a coded and compressed domain in a
predetermined manner to form a combined bitstream of
successive segments of coded and compressed pixel
data, the segments being ordered in the combined
bitstream in such a manner that the combined bitstream
represents a coded and compressed single frame that is
a composite of the frames of the plural input image
signals; and

decompressing and decoding the successive segments of
the combined bitstream to form a frame of pixel data of
an image signal that is a composite of the frames of the
plural input image signals.

18. The method of claim 17 further comprising the step of 
rearranging the images within the composite output image. 

19. The method of claim 17 wherein the input image 
signals are video signals. 

20. The method of claim 18 wherein the step of decom-
pressing and decoding uses an MPEG standard decoder.

21. The method of claim 17 wherein the input image 
signals have different resolutions. 

22. The method of claim 21 wherein the different reso-
lutions of the input image signals are 1/4, 1/16 and 1/64 of the
resolution of the composite output image.